DETAILED ACTION



Response to Amendment 



1. The declaration filed on 21 September 2009 under 37 CFR 1.131 has been
considered but is ineffective to overcome the Lee reference. 

2. The evidence submitted is insufficient to establish a conception of the invention 
prior to the effective date of the Lee reference. While conception is the mental part of 
the inventive act, it must be capable of proof, such as by demonstrative evidence or by 
a complete disclosure to another. Conception is more than a vague idea of how to 
solve a problem. The requisite means themselves and their interaction must also be 
comprehended. See Mergenthaler v. Scudder, 1897 C.D. 724, 81 O.G. 1417 (D.C. Cir.
1897). Although the G.SELECT.8 instruction is shown to rearrange data based on a 64-bit
selector, there is no evidence showing that the elements are provided in parallel to the 
catenated result. Additionally, regarding claims 12 and 25, while the documents 
mention multiply instructions, the evidence does not show providing a plurality of
products as a catenated result. 



Withdrawn Objections 



3. Applicant, via amendment, has overcome the objections of claims 12 and 25.
Consequently, the objections have been respectfully withdrawn.

Claim Rejections - 35 USC § 102

4. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that
form the basis for the rejections under this section made in this Office action: 



A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public 
use or on sale in this country, more than one year prior to the date of application for patent in the United 
States. 

(e) the invention was described in (1) an application for patent, published under section 122(b), by 
another filed in the United States before the invention by the applicant for patent or (2) a patent 
granted on an application for patent by another filed in the United States before the invention by the 
applicant for patent, except that an international application filed under the treaty defined in section 
351(a) shall have the effects for purposes of this subsection of an application filed in the United States
only if the international application designated the United States and was published under Article 21(2) 
of such treaty in the English language. 



5. Claims 1, 2, 4-8, 11-15, 17-21, 24-26, 40, 41, 43, 46, 50, 54, 55, 57, 60 and 64
are rejected under 35 U.S.C. 102(b) as being anticipated by Blelloch (Vector
Models for Data-Parallel Computing).

6. Referring to claim 1, Blelloch discloses, as claimed, a method of processing data
in a single programmable processor (such as the vector processor shown on page 20),
the method comprising: decoding a single instruction (Inverse-permute: see page 66)
for selectively arranging data, specifying a data selection operand (see Vector File
address format in Fig. 13) and a first and a second register (First and Second halves of
A; see page 66 regarding inverse-permute) each having a register width, the first and
second registers providing a plurality of data elements (A0-A7) each having an
elemental width smaller than the register width, the data selection operand comprising a
plurality of fields (Each element in Index Vector I; see page 66) each selecting one
(such as 3, 0, 7, 2 and 6 as shown in the last figure of page 66) of the plurality of data
elements (selecting A3, A0, A7, A2 and A6); and providing in parallel (see page 60, fig.
4.1 regarding permute instructions being vector instructions run on a vector processor)
the data elements selected by the fields (to the output of the instruction) to respective
predetermined positions in a catenated result (the corresponding element in the result),
wherein the predetermined positions are in the same order as the field of the data
selection operand (see the last figure of page 66).
7. 

8. and for each field of the data selection operand, providing the data element 
selected by the field to a predetermined position in a catenated result. Note claims 13,
14 and 26 recite corresponding limitations as set forth in claim 1.
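
For illustration only, the selection behavior that the rejection attributes to the inverse-permute instruction can be sketched in Python as follows. The function name select_elements and the string labels are hypothetical and appear in neither Blelloch nor the claims; the index values 3, 0, 7, 2 and 6 are those cited above from page 66. The sketch models only the mapping from index fields to result positions, not the parallel hardware.

```python
def select_elements(values, indices):
    """Return one selected element per index field, in field order."""
    # Each index field independently picks one element of `values`;
    # the output position is fixed by the position of the field itself.
    return [values[i] for i in indices]


# Hypothetical labels for the data elements A0-A7, split across the two
# register halves as the rejection reads them.
first_half = ["A0", "A1", "A2", "A3"]
second_half = ["A4", "A5", "A6", "A7"]
values = first_half + second_half

# Index values taken from the figure cited in the rejection.
result = select_elements(values, [3, 0, 7, 2, 6])
print(result)
```

Under these assumptions the k-th field determines the k-th position of the result, which is the correspondence relied upon for claims 11 and 24.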

9. As to Claim 26, Blelloch discloses the first register (First half of register A holding 
A0-A3) providing a plurality of data elements (such as elements 0, 2 and 3; see the last
figure on page 66). 

10. As to claim 2, Blelloch also discloses: the method of claim 1 wherein each field of 
the data selection operand provides a sufficient number of bits to specify any one of the 
plurality of data elements ( See page 66: Note that clearly any value in the values vector 
can be selected). Note Claim 15 recites the corresponding limitations as set forth in
claim 2. 

11. As to claim 4, Blelloch also discloses: the method of claim 1 wherein the data 
selection operand is provided by a register specified by the single instruction ( The 
Indices vector: see page 66 regarding "inverse-permute values indices" ). Note Claim 
17 recites the corresponding limitations as set forth in claim

12. As to claim 5, Blelloch also discloses: the method of claim 4 wherein the data 
selection operand ( The Indices vector: see page 66 ) has a width equal to the specified 
register width ( See page 66 regarding "The values vector must be equal or longer than 
the indices vector." ). Note Claim 18 recites the corresponding limitations as set forth in 
claim 5. 

13. As to claim 6, Blelloch also discloses: the method of claim 1 wherein the 
catenated result is provided to a register ( Inherently, if a vector instruction is executed, it 
must be stored. Since a register is merely a storage area in a processor, the result 
must be provided to a register ). Note Claim 19 recites the corresponding limitations as 
set forth in claim 6. 



14. As to claim 7, Blelloch also discloses: the method of claim 1 wherein the plurality 
of data elements ( A0-A7 ) has a combined width ( Length of all the values ) equal to the 
width of the first register ( First half of A ) plus the width of the second register ( Second
Half of A; Note that since the two registers are merely the two halves of A, they must 
have the same width as A ). Note Claim 20 recites the corresponding limitations as set 
forth in claim 7. 

15. As to claim 8, Blelloch also discloses: the method of claim 1 wherein the
instruction further specifies a data element width of the plurality of data elements ( See 
page 66 regarding the indices vector being variable length: Note that since the indices 
vector which is supplied with the instruction has a variable length, that length must be 
specified by the instruction ). Note Claim 21 recites the corresponding limitations as set 
forth in claim 8. 

16. As to claim 11, Blelloch also discloses that for each field of the data selection
operand, a relative location of the field within the data selection operand corresponds to 
a relative location of the predetermined position within the catenated result ( See page 
66: last figure ). Note Claim 24 recites the corresponding limitations as set forth in claim 
11. 

17. Referring to claim 12, Blelloch also discloses decoding a second single
instruction (ApxB: see page 61: section 4.1.2) specifying a third (A) and a fourth register
(B) each containing a plurality of floating-point operands (See page 169: section 11.1.3;
second paragraph regarding floating-point numbers); multiplying the plurality of floating
point operands in the third register by the plurality of operands in the fourth register to
produce a plurality of products (Result of ApxB: see page 61); and providing the
plurality of products to partitioned fields of a result register as a catenated result
(inherently the result must be stored in a register). Note Claim 25 recites the
corresponding limitations as set forth in claim 12.
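
As a minimal sketch of the multiply limitation discussed above, the following Python models two operand vectors multiplied element by element, with each product placed in the matching field of the result. The function name vector_multiply and the operand values are hypothetical, and a Python list stands in for the partitioned fields of a register; nothing here is taken from Blelloch's text.

```python
def vector_multiply(third, fourth):
    """Multiply corresponding floating-point operands of two registers."""
    if len(third) != len(fourth):
        raise ValueError("both registers must hold the same number of operands")
    # Product k lands in field k of the result, so the products sit side
    # by side (catenated) in a single result.
    return [a * b for a, b in zip(third, fourth)]


# Hypothetical operand values chosen to be exactly representable.
products = vector_multiply([1.5, 2.0, -3.0, 0.5], [2.0, 4.0, 1.0, 8.0])
print(products)
```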

18. Regarding claim 40, Blelloch discloses a method of processing data in a 
programmable processor, the method comprising: decoding a single instruction (Inverse 
Permute instruction, see page 66) specifying a plurality of registers (First and Second 
half of A: see page 66) each having a register width ( Inherently, registers must have a 
width ) , the plurality of registers storing a plurality of data elements ( Segments of Data 
vector storing A0-A7) each having an elemental width smaller than the register width 
( Inherently, a subsection of the entire vector register must be smaller than the whole ),
an index register storing an index vector (Index Vector I: see page 66) comprising a 
plurality of indices stored in partitioned fields of the index register (each selector in the 
index vector) and a destination register ( Inverse-Permute(A.I): see page 66: Note that 
the operation must have a destination ); wherein each index in the index vector 
comprises a sufficient number of bits to represent a range of possible index values (See 
page 66: Note that clearly any value in the values vector can be selected) , the range of 
possible index values including a different index value for each of the plurality of data 
elements stored in the plurality of registers, allowing the index to select any data 
element from the plurality of data elements stored in the plurality of registers ( 0-7: See 
page 66 regarding Inverse-permute instruction ); wherein each index in the index vector
independently selects one of the data elements from the plurality of data elements
stored in the plurality of registers ( See last paragraph of page 66 ); and for each index in
the index vector, providing a data element selected by the index to a predetermined 
position (Each position in the result corresponding to the index, see page 66) in the 
destination register. Note Claims 50, 54 and 64 recite the corresponding limitations as 
set forth in claim 40. 

19. Regarding claim 41, Blelloch also discloses the plurality of registers comprises
two registers (First and Second halves of A; additionally, any plurality of registers will
inherently comprise two registers since it is a plurality). Note Claim 55 recites the
corresponding limitations as set forth in claim 41.

20. Regarding claim 43, Blelloch also discloses the number of selectors stored in the 
index register is equal to the number of predetermined positions in the destination 
register (see last figure of page 66) . Note Claim 57 recites the corresponding limitations 
as set forth in claim 43. 

21. Regarding claim 46, Blelloch also discloses the index stored in a lowest order set
of bits of the index register provides a data element to a lowest order set of bits of the
destination register, the index in a second lowest order set of bits of the index register
provides a data element to a second lowest order set of bits of the destination register
and the index stored in a highest order set of bits of the index register provides a data
element to a highest order set of bits of the destination register (see last figure in page
66: as shown, the nth element in the index corresponds to the nth element in the
destination). Note Claim 60 recites the corresponding limitations as set forth in claim 46.



22. Claims 1, 11, 14 and 24 are rejected under 35 U.S.C. 102(e) as being
anticipated by Lee (U.S. Patent No. 6,381,690). 

23. Referring to claims 1 and 14, Lee discloses, as claimed, a method of processing
data in a single programmable processor, the method comprising: decoding a single
instruction (performed in figure 1) for selectively arranging data, specifying a data
selection operand (Order word 26: see figure 1) and a first and a second register (Items
1-2 and Items 3-4: see figure 1) each having a register width (inherently, registers have
a width), the first and second registers providing a plurality of data elements (Items 1-4)
each having an elemental width smaller than the register width (inherently, parts are
smaller than the whole), the data selection operand comprising a plurality of fields (see
figure 1 regarding the 4 sections of Order Word) each selecting one (using multiplexers
41-44: see figure 2) of the plurality of data elements; and providing in parallel (see fig. 2
regarding the multiplexers) the data elements selected by the fields (01-04: see figure
2) to respective predetermined positions in a catenated result (01 selects T1, 02 selects
T2, etc.; see figure 2), wherein the predetermined positions are in the same order as
the field of the data selection operand (see fig. 2).

24. Referring to claims 11 and 24, Lee also discloses that for each field of the data
selection operand, a relative location of the field within the data selection operand
corresponds to a relative location of the predetermined position within the catenated
result (01 selects T1, 02 selects T2, etc.; see figure 2).
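
The multiplexer arrangement relied upon above can be sketched, under stated assumptions, as bit-field selection on packed words. The 2-bit field width, the 16-bit item width, the low-order-first field layout, and the name mux_select are all assumptions made for illustration; they are not taken from Lee. The Python loop runs sequentially, whereas the multiplexers of figure 2 would operate side by side.

```python
FIELD_BITS = 2   # assumed: 2 bits are enough to pick one of 4 items
ITEM_BITS = 16   # assumed item width, for illustration only


def mux_select(order_word, items):
    """Build a packed result in which field k of order_word picks item k's source."""
    field_mask = (1 << FIELD_BITS) - 1
    result = 0
    for position in range(len(items)):
        # Extract the field that controls this output position.
        field = (order_word >> (position * FIELD_BITS)) & field_mask
        # Place the selected item at the matching position of the result.
        result |= items[field] << (position * ITEM_BITS)
    return result


# Hypothetical items 1-4, listed from the lowest-order position upward.
items = [0x1111, 0x2222, 0x3333, 0x4444]

# Fields, from the lowest-order position upward, hold 3, 2, 1, 0 (a reversal).
reversed_word = mux_select(0b00011011, items)
print(hex(reversed_word))
```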

Claim Rejections - 35 USC § 103 

25. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

Claims 12 and 25 are rejected under 35 U.S.C. 103(a) as being unpatentable over Lee 
in view of Matsuura (US Patent No. 4,725,973) herein referred to as Matsuura. 

Referring to claims 12 and 25, Lee does not expressly disclose decoding a
second single instruction specifying a third and a fourth register each containing a
plurality of floating-point operands; multiplying the plurality of floating point operands in
the third register by the plurality of operands in the fourth register to produce a plurality
of products; and providing the plurality of products to partitioned fields of a result
register as a catenated result.

Matsuura teaches decoding a second single instruction (Vector Multiply) 
specifying a third (VR 1) and a fourth register (VR 1); multiplying the plurality of floating
point operands in the third register by the plurality of operands in the fourth register to
produce a plurality of products; and providing the plurality of products to partitioned
fields of a result register (VR 3) as a catenated result (See col. 2, lines 5-20). 

At the time of the invention, it would have been obvious for one of ordinary skill in 
the art to have modified the invention of Lee by using a Vector multiply instruction, as 
taught by Matsuura, resulting in predictable results for the purpose of increasing 
flexibility and performance of SIMD processing. 

26. Claims 3, 9, 10, 16, 22, 23, 42, 44, 45, 47-49, 51-53, 56, 58, 59, 61-63 and 65-67
are rejected under 35 U.S.C. 103(a) as being unpatentable over Blelloch (Vector 
Models for Data-Parallel Computing) in view of In re Rose, 105 USPQ 237 (CCPA 
1955). 

Blelloch does not expressly disclose the data elements and the predetermined
positions are 8-bit (Claims 42, 9, 22, 50 and 64), the selectors are equal-sized (Claims
45, 50, 59 and 64) 4-bit (n) elements (Claims 3, 16, 48, 49, 53, 62, 63 and 67), the first
and second registers are 64-bit registers (Claims 42, 51, 56 and 65), the index register
is 64-bit (Claims 44, 51, 58 and 65), the destination register is 128-bit (Claims 10, 23, 47,
52, 61 and 66) and there are 16 (2^n) data elements (Claims 3, 16, 42, 49, 56 and 63).



In re Rose has shown that changes in size, such as a change in the size of the
data, are not generally given patentable weight or would have been obvious
improvements. Hence, it would have been obvious at the time of the invention for one
of ordinary skill in the art to have modified the invention of Blelloch by making the
predetermined positions 8-bit, the selectors equal-sized 4-bit elements, the first and
second registers 64-bit registers, the index register 64-bit and the destination register
128-bit, as In re Rose has shown to be obvious. Functionally, the size or number of the
registers (elements) makes no difference to the overall operation of the system. Simply
adding or removing a few addressing bits does not render a computer system novel. In
this case, one of ordinary skill in the art would have found it obvious to use 4 bits to
address the elements in the vector.



MPEP 2141 reads, in part, as follows: 

The Supreme Court in KSR reaffirmed the familiar framework for determining
obviousness as set forth in Graham v. John Deere Co. (383 U.S. 1, 148 USPQ
459 (1966)), but stated that the Federal Circuit had erred by applying the
teaching-suggestion-motivation (TSM) test in an overly rigid and formalistic way.
KSR, 550 U.S. at ___, 82 USPQ2d at 1391. Specifically, the Supreme Court stated
that the Federal Circuit had erred in four ways: (1) "by holding that courts and
patent examiners should look only to the problem the patentee was trying to
solve" (Id. at ___, 82 USPQ2d at 1397); (2) by assuming "that a person of ordinary
skill attempting to solve a problem will be led only to those elements of prior art
designed to solve the same problem" (Id.); (3) by concluding "that a patent claim
cannot be proved obvious merely by showing that the combination of elements
was 'obvious to try'" (Id.); and (4) by overemphasizing "the risk of courts and
patent examiners falling prey to hindsight bias" and as a result applying "[r]igid
preventative rules that deny factfinders recourse to common sense" (Id.).

In KSR, the Supreme Court particularly emphasized "the need for caution in
granting a patent based on the combination of elements found in the prior art," Id.
at ___, 82 USPQ2d at 1395, and discussed circumstances in which a patent might
be determined to be obvious. Importantly, the Supreme Court reaffirmed
principles based on its precedent that "the combination of familiar elements
according to known methods is likely to be obvious when it does no more than
yield predictable results." Id. at ___, 82 USPQ2d at 1395.

The Supreme Court further stated that: 

When a work is available in one field of endeavor, design incentives and other
market forces can prompt variations of it, either in the same field or a different
one. If a person of ordinary skill can implement a predictable variation, § 103
likely bars its patentability. For the same reason, if a technique has been used to
improve one device, and a person of ordinary skill in the art would recognize that
it would improve similar devices in the same way, using the technique is obvious
unless its actual application is beyond his ordinary skill. Id. at ___, 82 USPQ2d at
1396. When considering obviousness of a combination of known elements, the
operative question is thus "whether the improvement is more than the predictable
use of prior art elements according to their established functions." Id. at ___, 82
USPQ2d at 1396.



MPEP 2144.04 A reads, in part, as follows: 

In re Rose, 220 F.2d 459, 105 USPQ 237 (CCPA 1955) (Claims directed to a
lumber package "of appreciable size and weight requiring handling by a lift truck"
were held unpatentable over prior art lumber packages which could be lifted by
hand because limitations relating to the size of the package were not sufficient to
patentably distinguish over the prior art.); In re Rinehart, 531 F.2d 1048, 189
USPQ 143 (CCPA 1976) ("mere scaling up of a prior art process capable of
being scaled up, if such were the case, would not establish patentability in a
claim to an old process so scaled." 531 F.2d at 1053, 189 USPQ at 148.).

In Gardner v. TEC Systems, Inc., 725 F.2d 1338, 220 USPQ 777 (Fed. Cir. 
1984), cert. denied, 469 U.S. 830, 225 USPQ 232 (1984), the Federal Circuit
held that, where the only difference between the prior art and the claims was a 
recitation of relative dimensions of the claimed device and a device having the 
claimed relative dimensions would not perform differently than the prior art 
device, the claimed device was not patentably distinct from the prior art device. 

All the elements necessary to produce applicants' invention were known in the
art. How one combined such elements to produce applicants' invention was also known
in the art. Evidence of this is that applicants' disclosure lacks any detailed description of
unique technology necessary to implement applicants' invention. One of ordinary skill
would have readily recognized that the results of the combination were predictable.
Absent some secondary considerations, not in evidence at this time, applicants'
invention is obvious over the combination of prior art presented. Increasing the number
or size of the registers does not change the processor functionally. Realistically, the
number of vector registers would most likely be relatively small. The size of the indexes
used for addressing the registers would be log2(n) bits, wherein n is the number of
addressable registers. Therefore, in order to have a reasonable system, a very small
number of bits would be used to address registers, and one of ordinary skill in the art
would have found a size of 4 bits (16 elements addressable) to be obvious because
there are a limited number of small integers. Additionally, the actual size of registers is
normally a power of 2; therefore there are only a few reasonable sizes for registers (1,
2, 4, 8, 16, 32, 64, 128...). All of these values are extremely common in the art and
would have been obvious to substitute into Blelloch's system.
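
The size relationships discussed above can be checked with a short worked calculation. This is a sketch for illustration only: the helper name selector_bits is hypothetical, and the widths are the ones recited in the claims as summarized in this rejection (16 data elements, 8-bit elements, 4-bit selectors, 64-bit source and index registers, 128-bit destination).

```python
def selector_bits(num_elements):
    """Smallest number of bits that can address num_elements distinct items."""
    # Equivalent to log2(n) when n is a power of two.
    return (num_elements - 1).bit_length()


NUM_ELEMENTS = 16    # 2**n data elements with n = 4
ELEMENT_BITS = 8     # width of each data element
REGISTER_BITS = 64   # width of each source register

bits_per_selector = selector_bits(NUM_ELEMENTS)
index_register_bits = NUM_ELEMENTS * bits_per_selector
source_registers = (NUM_ELEMENTS * ELEMENT_BITS) // REGISTER_BITS
destination_bits = NUM_ELEMENTS * ELEMENT_BITS  # one element per selector

print(bits_per_selector, index_register_bits, source_registers, destination_bits)
```

With these figures, sixteen 4-bit selectors fill a 64-bit index register, the sixteen 8-bit elements occupy two 64-bit source registers, and the sixteen selected elements fill a 128-bit destination.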



Response to Arguments 

27. Applicant's arguments filed 21 September have been fully considered but they
are not persuasive. 

28. Regarding argument A.1, Examiner respectfully disagrees. The claims require
that data elements are independently selected by each index. This language is
interpreted as requiring that each selection is performed independently based on only 1
index. This is in contrast with many selection schemas, such as shift, extract or shuffle,
which use a selector to dependently rearrange multiple elements. Applicant has
classified the G.SELECT.8 as a "Group Permute" function (See Exhibit 1, page 80, filed
8 January 2009). Permutation is merely reordering and would therefore require that all
of the indexes be unique (in order to be a "permute" function). Furthermore, the
specification does not mention or support duplicating data or explain how the indexes 
are selected. If the claim were to be interpreted in this manner, the specification would 
not provide support under 35 U.S.C. 112, first paragraph. 
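
The distinction drawn in this paragraph, between a reordering in which every index is unique and a selection in which an index may repeat and thereby duplicate data, can be illustrated with the following Python sketch. The names has_unique_indices and select_elements and the sample index lists are hypothetical and serve only to show the two cases; they do not describe the G.SELECT.8 instruction itself.

```python
def has_unique_indices(indices):
    """True when no index repeats, as a pure reordering would require."""
    return len(set(indices)) == len(indices)


def select_elements(values, indices):
    """Each index independently picks one element; repeats are allowed."""
    return [values[i] for i in indices]


values = ["A0", "A1", "A2", "A3"]

reorder = [3, 0, 2, 1]      # every index unique: a reordering
duplicate = [3, 3, 0, 0]    # repeated indexes: elements are duplicated

print(has_unique_indices(reorder), select_elements(values, reorder))
print(has_unique_indices(duplicate), select_elements(values, duplicate))
```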

29. Regarding argument A.2., Examiner respectfully disagrees. The term processor 
is broadly and reasonably interpreted as a machine in which processing is done (by 
instructions). It is common in the art to refer to combined systems as a single 
processor. For example, the term "dual-core processor" is well known in the art to 
contain 2 processors. Additionally, as shown in Applicants' specification (see fig. 3) part 
of the entire system contains many addition and multiplication processors to perform 
matrix multiplication. Even if the limitation is interpreted as requiring a single vector 
processor, Blelloch discloses the use of a V-RAM or vector processing model (see page 
20, Figure 2.1) in addition to other models. The instructions discussed can be 
implemented with any of the models. Furthermore, vector processors are extremely well
known in the art and Sakata et al. (U.S. Patent No. 4,734,877) describes one in detail. 

Conclusion 

30. THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time 
policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the mailing date of this final action. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to JESSE R. MOLL whose telephone number is (571)272-2703.
The examiner can normally be reached on M-F 10:00 am - 6:30 pm EST. If
attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Alford Kindred can be reached on (571)272-4037. The fax phone number 
for the organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 

/Alford W. Kindred/ Jesse R Moll 

Supervisory Patent Examiner, Art Unit 2181 Examiner 

Art Unit 2181 



/J. R. M./ 

Examiner, Art Unit 2181 



